Decision Tree-Based Acoustic Models for Speech Recognition with Improved Smoothness
Abstract
This article proposes a new acoustic model that uses decision trees (DTs) in place of Gaussian mixture models (GMMs) to compute the observation likelihoods for a given hidden Markov model state in a speech recognition system. DTs have several advantageous properties: they impose no restrictions on the number or types of features, and they perform feature selection automatically. This article explores and exploits DTs for large-vocabulary speech recognition. Equal and decoding questions are newly introduced into DTs to directly model the gender- and context-dependent acoustic space. Experimental results on the 5k ARPA Wall Street Journal task show that, as expected, context information significantly improves the performance of DT-based acoustic models. Context-dependent DT-based models are highly compact compared to conventional GMM-based acoustic models, which indicates that the proposed models share data effectively across context classes.
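The core idea of the abstract, a tree that returns an observation likelihood for one HMM state, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class names, the feature name `c1`, the `gender` attribute, the thresholds, and the leaf probabilities are all hypothetical, and the "equal question" here is simply assumed to be an equality test on a discrete attribute. Decoding questions are not modeled because the abstract does not define them.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, Union

# One observation: continuous acoustic features plus discrete attributes.
# The keys used below ("c1", "gender") are illustrative assumptions.
Frame = Dict[str, Union[float, str]]


@dataclass
class Leaf:
    # Log-likelihood assigned to every frame that reaches this leaf.
    log_likelihood: float


@dataclass
class Node:
    # A yes/no question about the frame, with one subtree per answer.
    question: Callable[[Frame], bool]
    yes: Union["Node", Leaf]
    no: Union["Node", Leaf]


def threshold_question(feature: str, threshold: float) -> Callable[[Frame], bool]:
    """Question on a continuous feature: is frame[feature] < threshold?"""
    return lambda frame: frame[feature] < threshold


def equal_question(attribute: str, value: str) -> Callable[[Frame], bool]:
    """Assumed form of an 'equal question': is frame[attribute] == value?"""
    return lambda frame: frame[attribute] == value


def log_likelihood(tree: Union[Node, Leaf], frame: Frame) -> float:
    """Walk from the root to a leaf and return the leaf's log-likelihood."""
    node = tree
    while isinstance(node, Node):
        node = node.yes if node.question(frame) else node.no
    return node.log_likelihood


# Toy tree for a single HMM state; all numbers are made up for illustration.
toy_tree = Node(
    equal_question("gender", "female"),
    yes=Node(threshold_question("c1", 0.5),
             yes=Leaf(math.log(0.6)), no=Leaf(math.log(0.1))),
    no=Node(threshold_question("c1", -0.2),
            yes=Leaf(math.log(0.2)), no=Leaf(math.log(0.4))),
)

female_frame = {"gender": "female", "c1": 0.3}
male_frame = {"gender": "male", "c1": 0.0}
```

In a decoder, `log_likelihood(toy_tree, frame)` would stand in for the GMM score of that state. Because one tree can branch on gender or context as well as on acoustic features, different context classes can share subtrees and leaves, which is one plausible reading of the compactness the abstract reports.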
Similar resources
Class-triphone Acoustic Modeling Based on Decision Tree for Mandarin Continuous Speech Recognition
Decision tree based acoustic modeling has become increasingly popular for modeling speech spectral variations in continuous speech. In this paper, class-triphone acoustic models based on the decision tree are investigated for Mandarin speaker-independent continuous speech recognition. Three main questions are discussed: how to select base phone models, how to generate the question set based on l...
A study of bootstrapping with multiple acoustic features for improved automatic speech recognition
This paper investigates a scheme of bootstrapping with multiple acoustic features (MFCC, PLP and LPCC) to improve the overall performance of automatic speech recognition. In this scheme, a Gaussian mixture distribution is estimated for each type of feature resampled in each HMM state by single-pass retraining on a shared decision tree. Thus obtained acoustic models based on the multiple feature...
High accuracy acoustic modeling based on multi-stage decision tree
In many continuous speech recognition systems based on HMMs, decision tree-based state tying has been used not only to improve the robustness and accuracy of context-dependent acoustic modeling but also to synthesize unseen models. To construct the phonetic decision tree, the standard method uses only single-Gaussian triphone models to cluster states. The coarse clusters generated using just ...
Conversion from phoneme based to grapheme based acoustic models for speech recognition
This paper focuses on acoustic modeling in speech recognition. A novel approach to building grapheme-based acoustic models by conversion from existing phoneme-based acoustic models is proposed. The grapheme-based acoustic models are created as a weighted sum of monophone acoustic models. The influence of a particular monophone is determined by the phoneme-to-grapheme confusion matrix. Furthe...
Convolutional Neural Network with Adaptable Windows for Speech Recognition
Although speech recognition systems are widely used and their accuracy is continuously increasing, there is a considerable gap between their accuracy and human recognition ability. This is partially due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
Journal title:
- IEICE Transactions
Volume 94-D, Issue
Pages -
Publication date: 2011